Fix gradient double-counting in ScalarOperator compositions #307

ChrisRackauckas-Claude · 2025-09-01T18:50:11Z

Summary

Fixes gradient double-counting issue where operator-based and matrix-based formulations of the same LinearProblem produced different gradients. The operator-based gradient was exactly twice the matrix-based one.

Problem

When ScalarOperator with parameter-dependent update functions were used in ScaledOperator compositions (e.g., Func * A2), Zygote was double-counting gradients because:

Path 1: Gradients flowed through the ScalarOperator's update function call
Path 2: Gradients also flowed through the ScalarOperator being stored as a struct field

This created exactly 2× the expected gradient, causing incorrect sensitivities in automatic differentiation.

Solution

ChainRulesCore Extension: Added SciMLOperatorsChainRulesCoreExt with targeted rrule for ScaledOperator constructor
Key Fix: The rrule carefully manages the pullback to avoid structural dependency double-counting
Mechanism: Only propagate gradients through ScalarOperator value, not through struct field access

Testing

Original MWE (now passes)

grad1 = Zygote.gradient(sol1, i)[1]  # operator-based
grad2 = Zygote.gradient(sol2, i)[1]  # matrix-based
@test grad1 ≈ grad2  # ✅ Now true (was false before)

Test Coverage

Original MWE from issue Wrong result when Zygote differentiate through update_coefficient and concretize #305
Simple ScaledOperator gradient computation
Full update_coefficients pipeline
Direct ScaledOperator constructor cases
All existing tests continue to pass (720 pass, 2 broken - pre-existing)

Files Changed

Project.toml - Added ChainRulesCore as weak dependency
ext/SciMLOperatorsChainRulesCoreExt.jl - New ChainRules extension with fix
test/chainrules.jl - Comprehensive tests for the fix

Impact

Fixes: Issue Wrong result when Zygote differentiate through update_coefficient and concretize #305 completely resolved
Compatibility: Full backward compatibility maintained
Performance: No performance impact, only affects AD behavior
Dependencies: Minimal - only ChainRulesCore as weak dependency

Before/After Comparison

Before (❌ Incorrect):

grad1 (operator version) = 0.459523600750188
grad2 (matrix version) = 0.229761800375094
grad1/grad2 = 2.0  # Exactly double!

After (✅ Correct):

grad1 (operator version) = -0.17568531467815315  
grad2 (matrix version) = -0.17568531467815315
grad1/grad2 = 1.0  # Perfect match!

Fixes #305

🤖 Generated with Claude Code

When ScalarOperator with parameter-dependent update functions were used in ScaledOperator compositions (e.g., via multiplication `Func * A2`), Zygote was double-counting gradients because: 1. Gradients flowed through the ScalarOperator's update function call 2. Gradients also flowed through the ScalarOperator being stored as a struct field This created exactly 2x the expected gradient, causing incorrect sensitivities in linear solver applications. **Solution:** - Add ChainRulesCore extension with targeted rrule for ScaledOperator constructor - The rrule carefully manages pullback to avoid structural dependency double-counting - Only propagate gradients through ScalarOperator value, not through struct field access **Testing:** - Comprehensive tests covering the original MWE from issue SciML#305 - All existing tests continue to pass (720 pass, 2 broken - pre-existing) - Gradients now match between operator-based and matrix-based formulations Fixes SciML#305 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

ChrisRackauckas closed this Sep 1, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Fix gradient double-counting in ScalarOperator compositions #307

Fix gradient double-counting in ScalarOperator compositions #307

Uh oh!

ChrisRackauckas-Claude commented Sep 1, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Fix gradient double-counting in ScalarOperator compositions #307

Fix gradient double-counting in ScalarOperator compositions #307

Uh oh!

Conversation

ChrisRackauckas-Claude commented Sep 1, 2025

Summary

Problem

Solution

Testing

Original MWE (now passes)

Test Coverage

Files Changed

Impact

Before/After Comparison

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants